LO-LRU: fix UpdateLruList early-return bug + add large↔small state-flip tests#452
Draft
LO-LRU: fix UpdateLruList early-return bug + add large↔small state-flip tests#452
Conversation
…copilot/add-large-obj-page-state-tests
|
|
…ly-return bug Co-authored-by: xiexiaoy <5558602+xiexiaoy@users.noreply.github.com>
Copilot
AI
changed the title
[WIP] Add tests for page large_obj_page state transitions
LO-LRU: fix UpdateLruList early-return bug + add large↔small state-flip tests
Mar 12, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Supplements PR #426 (LO-LRU) with state-flip coverage and fixes a partition-invariant bug discovered during test authoring.
Bug fix —
CcShard::UpdateLruList(cc_shard.cpp)When a page already sitting just-before-tail had
large_obj_page_flippedfalse→true(small→large), the early-return path skipped theprotected_head_page_update. Result:protected_head_page_stayed attail_ccp_even with a large page in the list, putting that page in the "small" eviction segment — the opposite of what LO-LRU intends.Fix: in the early-return block, if policy is LO_LRU and the page is now large and
protected_head_page_==tail, setprotected_head_page_=page.New tests —
LargeObjLRU-Test.cppVerifyLruListInvarianthelper — walks the full LRU list asserting:protected_head_page_istailor a large page[head, protected_head)contains only small pages[protected_head, tail)contains only large pageslru_prevback-pointer is consistentSix flip scenarios:
protected_head_page_resets totailprotected_head_page_advances to next largeprotected_head_page_set to that page (exercises bug fix)protected_head_page_unchangedEnsureLargeObjOccupyPageAlone, entry deletedprotected_head_page_resetsEnsureLargeObjOccupyPageAlone, single entry exceeds thresholdprotected_head_page_updated (exercises bug fix)Minor
"Invalidate"→"Invalid"intx_service_init.cpperror log.fixes eloqdb/tx_service#issue_id./mtr --suite=mono_main,mono_multi,mono_basicOriginal prompt
在 repo eloqdata/tx_service 的 PR #426 (Support optional large object LRU policy) 上补足单测覆盖“状态翻转(page large_obj_page_ true/false 的翻转)”场景。
背景/现状:
需求:
在 tests/LargeObjLRU-Test.cpp 中新增测试用例(或新增测试文件亦可,但优先放在现有 LargeObjLRU-Test.cpp),覆盖至少以下两类“状态翻转”场景,并且要验证 LRU 分区不变量:
测试应尽量使用 PR 里已有的测试 helper(例如 VerifyLruList、CreatePage、InsertEntry/Emplace 等,如果存在),并在断言中明确检查:
测试必须在 Debug/Release 都稳定:不要依赖触发 assert 来证明正确性。
如为达成测试需要,对生产代码做最小必要的可测试性改动是允许的(例如暴露只读访问器来读取 protected_head_page_ 或提供测试 hook),但要:
交付:
bazel test/现有 test suite(按 repo 现状的构建系统)通过;The following is the prior conversation context from the user's chat exploration (may be truncated):
User: 根据这个PR描述的设计,仔细深刻地review这个PR
Assistant: 下面是基于 PR #426(“Support optional large object LRU policy”)描述的设计与当前代码改动,做一次“按设计逐条对照 + 深挖边界/并发/一致性/可维护性”的 review。重点会放在:LO-LRU 分区不变量是否真的被所有路径维护、entry 迁移是否满足 lock/状态契约、以及 ccmp_/page key/LRU 指针在迁移时是否可能被破坏。
1) 先复述一下 PR 设计(我理解的“不变量”)
PR 描述的核心目标:
protected_head_page_:lolru_large_obj_threshold_kb),要确保它所在 page 是 “large_obj_page_ == true” 且 page 上只有一个 object(如果原 page >1 entry 就把该 entry 迁移到新 page)。EnsureLargeObjOccupyPageAlone()来维护上面不变量。配置项:
cache_evict_policy: LRU | LO_LRU(默认 LRU)lolru_large_obj_threshold_kb(默认 1024KB)2) 代码改动总览(结构上你做了什么)
2.1 配置与策略开关
core/src/tx_service_init.cppcache_evict_policy、lolru_large_obj_threshold_kbtx_service_->SetupPolicyLRU()tx_service_->SetupPolicyLoLRU(threshold_kb)tx_service/include/type.hCacheEvictPolicyenum + 两个 policy struct(带字符串 NAME)tx_service/include/cc/local_cc_shards.hSetupPolicyLoLRU用large_obj_threshold_kb * 1024ull保存 bytes(这一点你已经修掉了 CodeRabbit 提醒的潜在溢出)2.2 Page/Entry 迁移基础设施
tx_service/include/cc/cc_entry.hLruPage增加bool large_obj_page_{false}CcPage增加:Emplace(key, unique_ptr<CcEntry>):用于迁移后的插入(并维护 smallest_ttl_ / last_dirty_commit_ts_)MoveEntry(idx):move 出 unique_ptrUpdateSmallestTTL():迁移/删除后重算最小 TTL2.3 LO-LRU 分区指针与插入规则
tx_service/include/cc/cc_shard.h/tx_service/src/cc/cc_shard.cppLruPage* protected_head_page_,初始化为&tail_ccp_UpdateLruList()在 LO_LRU 下:protected_head_page_前(从而不会跑到 large 段之后)2.4 EnsureLargeObjOccupyPageAlone(核心)
tx_service/include/cc/template_cc_map.hEnsureLargeObjOccupyPageAlone(ccp, cce):shard_->UpdateLruList(ccp, false)(对原 page 也更新 LRU)2.5 ObjectCcMap 多处调用 EnsureLargeObjOccupyPageAlone
tx_service/include/cc/object_cc_map.hThis pull request was created from Copilot chat.
💡 You can make Copilot smarter by setting up custom instructions, customizing its development environment and configuring Model Context Protocol (MCP) servers. Learn more Copilot coding agent tips in the docs.